List of Flash News about LLM training
| Time | Details |
|---|---|
| 2026-02-03 21:49 | **FP8 Training on NVIDIA H100 Cuts Time to GPT-2 to 2.91 Hours and Drops Cost Near 20 Dollars, According to @karpathy.** According to @karpathy, enabling FP8 training in the nanochat GPT-2 reproduction delivered a 4.3 percent improvement in time to GPT-2, reducing training to 2.91 hours on a single 8x H100 node. At spot pricing an 8x H100 run can cost about 20 dollars, while a previous 3.04-hour run cost about 73 dollars, highlighting roughly a 600x cost reduction versus OpenAI's original GPT-2 training. FP8 on H100 offers 2x theoretical FLOPs, but practical gains are limited by scaling and conversion overhead, training that is only partially compute-bound, and small GEMMs at GPT-2 scale, yielding about a 7.3 percent per-step speedup and roughly 5 percent net after adjusting the training horizon. torchao reported a 25 percent FP8 speedup on Llama 3 8B, implying larger models may benefit more, and he expects further gains from selectively applying FP8 to layers and tightening numerics. Additional wins came from FlashAttention-3, the Muon optimizer, gated residual and skip connections, and value embeddings, and he published a reproducible setup and a time-to-GPT-2 leaderboard on GitHub. A minimal torchao FP8 conversion sketch appears after this table. |
| 2025-05-31 16:00 | **Researchers Achieve Breakthrough in LLM Training with 4-bit FP4 Precision, Boosting Crypto AI Efficiency.** According to DeepLearning.AI, researchers have demonstrated that large language models (LLMs) can be trained using 4-bit FP4 precision for matrix multiplications, which account for 95% of training computation, without any loss of accuracy compared to the standard BF16 format. This breakthrough dramatically reduces computational requirements and hardware costs, potentially accelerating AI-powered blockchain and cryptocurrency analytics platforms by lowering entry barriers for decentralized AI projects (Source: DeepLearning.AI, May 31, 2025). An illustrative FP4 fake-quantization sketch appears after this table. |
| 2025-05-11 00:55 | **System Prompt Learning: The Emerging Paradigm in LLM Training and Its Crypto Market Implications.** According to Andrej Karpathy on Twitter, a significant new paradigm called system prompt learning is emerging in large language model (LLM) training, distinct from pretraining and fine-tuning (source: @karpathy, May 11, 2025). While pretraining builds foundational knowledge and fine-tuning shapes habitual behavior by altering model parameters, system prompt learning enables dynamic behavioral adaptation without changing parameters. For crypto traders, this development could accelerate AI-driven trading bots' adaptability to new market conditions, enhancing execution strategies and potentially impacting short-term volatility as AI trading tools become more responsive (source: @karpathy, May 11, 2025). A hypothetical sketch of the idea appears after this table. |
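
The FP8 item above refers to torchao's float8 training path. The snippet below is a minimal sketch, assuming a recent torchao release and an H100-class GPU, of converting a PyTorch model's linear layers for FP8 training; it is not the nanochat implementation, and the exact `convert_to_float8_training` API may differ by torchao version.

```python
# Minimal sketch: swap nn.Linear layers for FP8 training with torchao.
# Assumes an H100-class GPU and a recent torchao; API details may vary by version.
import torch
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

model = nn.Sequential(  # stand-in for a GPT-style transformer
    nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096)
).to("cuda", dtype=torch.bfloat16)

def large_enough(module, fqn):
    # Convert only reasonably large nn.Linear layers: FP8 needs dims divisible by 16,
    # and tiny GEMMs are where scaling overhead can outweigh the gain (one reason the
    # item cites limited net speedup at GPT-2 scale).
    return isinstance(module, nn.Linear) and all(
        d % 16 == 0 and d >= 1024 for d in (module.in_features, module.out_features)
    )

convert_to_float8_training(model, module_filter_fn=large_enough)
model = torch.compile(model)  # torchao's float8 path is meant to be used with torch.compile

opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)
loss = model(x).float().pow(2).mean()  # dummy loss for illustration only
loss.backward()
opt.step()
```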
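
The FP4 item above concerns 4-bit floating point matrix multiplications. The snippet below is only an illustrative fake-quantization toy in PyTorch that rounds matmul inputs to an E2M1-style FP4 grid; the helper name and per-tensor scaling are assumptions of this sketch, not the researchers' method or kernels.

```python
# Toy illustration of FP4 (E2M1-style) fake quantization around a matmul.
# This simulates the numeric format only; the cited research uses dedicated
# kernels and scaling schemes not reproduced here.
import torch

# Representable magnitudes of a 4-bit E2M1 float: sign x {0, .5, 1, 1.5, 2, 3, 4, 6}
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quant_fp4(x: torch.Tensor) -> torch.Tensor:
    """Round each value to the nearest FP4 grid point after per-tensor scaling."""
    scale = x.abs().max().clamp(min=1e-12) / FP4_GRID.max()
    mag = (x / scale).abs().unsqueeze(-1)         # (..., 1) for broadcasting
    idx = (mag - FP4_GRID).abs().argmin(dim=-1)   # nearest grid point per value
    return torch.sign(x) * FP4_GRID[idx] * scale

a = torch.randn(128, 256)
b = torch.randn(256, 64)
exact = a @ b
approx = fake_quant_fp4(a) @ fake_quant_fp4(b)    # matmul on "FP4" inputs
print(f"relative error: {(approx - exact).norm() / exact.norm():.3f}")
```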
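
The system prompt learning item above describes adapting behavior by editing a system prompt rather than model parameters. The snippet below is a purely hypothetical sketch of that loop; `complete` and `grade` are placeholder callables, not any real API or Karpathy's proposal.

```python
# Hypothetical sketch of "system prompt learning": the model's weights never change;
# lessons distilled from feedback are appended to a system prompt that steers future
# behavior. `complete` and `grade` are placeholders, not real APIs.
from typing import Callable, List

def system_prompt_learning(
    complete: Callable[[str, str], str],   # (system_prompt, user_msg) -> reply
    tasks: List[str],
    grade: Callable[[str, str], str],      # (task, reply) -> lesson text, or "" if fine
) -> str:
    system_prompt = "You are a helpful assistant.\n# Learned lessons:\n"
    for task in tasks:
        reply = complete(system_prompt, task)
        lesson = grade(task, reply)
        if lesson:
            # Fold the lesson back into the prompt, leaving parameters untouched.
            system_prompt += f"- {lesson}\n"
    return system_prompt
```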